A Compiler-Based Approach to Schema-Specific XML Parsing
Authors
Abstract
The validation of XML instances against a schema is usually performed separately from the parsing of the more basic syntactic aspects of XML. We posit, however, that schema information can be used during parsing to improve performance, an approach we call schema-specific parsing. This paper develops a framework for schema-specific parsing centered on an intermediate representation we call generalized automata, which abstracts the computational steps necessary to validate against a schema. The generalized automata can then be used to generate optimized code that might be onerous to write by hand. We present results that suggest this is a viable approach to high-performance XML parsing.
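To make the idea of using schema information during parsing concrete, here is a minimal sketch. It is not the paper's generalized automata, whose details the abstract does not give; it is an ordinary deterministic finite automaton for one hypothetical content model, `(id, item+, note?)`, of an assumed `order` element. The table `TRANSITIONS`, the set `ACCEPTING`, and the function `validate_children` are illustrative names, standing in for what a schema compiler might emit.

```python
# Hypothetical content model for an <order> element: (id, item+, note?)
# States are plain integers; a schema compiler could emit a table like this
# instead of interpreting the schema at run time.
TRANSITIONS = {
    (0, "id"): 1,      # the first child must be <id>
    (1, "item"): 2,    # at least one <item> is required
    (2, "item"): 2,    # further <item> children are allowed
    (2, "note"): 3,    # an optional trailing <note>
}
ACCEPTING = {2, 3}     # states in which the child sequence may end


def validate_children(child_names):
    """Return True if the sequence of child element names matches the model."""
    state = 0
    for name in child_names:
        state = TRANSITIONS.get((state, name))
        if state is None:
            # No transition exists: this child is not admissible here.
            return False
    return state in ACCEPTING
```

For example, `validate_children(["id", "item", "note"])` accepts, while `validate_children(["id", "note"])` is rejected because no `item` precedes the `note`. Because the table is fixed once the schema is known, the same checks could be emitted as straight-line code rather than a lookup, which is the kind of code that is tedious to write by hand.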
Related resources
A Compiler-Based Approach to Schema-Specific Parsers for XML
The Extensible Markup Language (XML) has become the de facto standard for interoperable data representation. Its human-readable, general syntax provides wide applicability and ease-of-use. These same characteristics, however, complicate the efficient processing of XML, and have created concerns about the performance of XML for distributed systems such as Web services. XML parsers are generally ...
Constructing Finite State Automata for High-Performance XML Web Services
This paper describes a validating XML parsing method based on deterministic finite state automata (DFA). XML parsing and validation are performed by a schema-specific XML parser that encodes the admissible parsing states as a DFA. This DFA is automatically constructed from the XML schemas of XML messages using a code generator. A two-level DFA architecture is used to increase efficiency and to re...
Toward Remote Object Coherence with Compiled Object Serialization for Distributed Computing with XML Web Services
Cross-platform object-level coherence in Web services-based distributed systems and grids requires lossless serialization to ensure programming-language specific objects are safely transmitted, manipulated, and stored. However, Web services development tools often suffer from lossy forms of XML serialization, which diminishes the usefulness of XML Web services as a competitive approach to binar...
Data-Binding Facility for the Java Platform
Sun Microsystems has recently undertaken to provide basic support for XML in the Java Platform. The proposed facilities include both an event-driven, SAX-compliant parser and an implementation of the W3C DOM (Document Object Model) parse-tree API. This is a critical first step, but using these fairly low-level APIs does require a moderately sophisticated understanding of XML. In order to make X...
JavaML 2.0: Enriching the Markup Language for Java Source Code
Although the representation of source code in plain text format is convenient for manipulation by programmers, it is not an effective format for processing by software engineering tools at an abstraction level suitable for source code analysis, reverse-engineering, or refactoring. Textual source code files require language-specific parsing to uncover program structure, a task undertaken by all ...